Overview

Dataset statistics

Number of variables: 14
Number of observations: 786600
Missing cells: 24767
Missing cells (%): 0.2%
Duplicate rows: 546
Duplicate rows (%): 0.1%
Total size in memory: 90.0 MiB
Average record size in memory: 120.0 B
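
The percentages and the average record size follow from the raw counts. A minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (observations times variables) and that 1 MiB is 1024 * 1024 bytes; the variable names are illustrative and not part of the report:

```python
# Counts copied from the dataset statistics in this report.
n_variables = 14
n_observations = 786600
missing_cells = 24767
duplicate_rows = 546
total_bytes = 90.0 * 1024 ** 2  # assumption: 90.0 MiB expressed in bytes

# Assumption: missing cells are measured against every cell in the table.
missing_pct = 100 * missing_cells / (n_variables * n_observations)
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_bytes / n_observations

print(f"Missing cells (%): {missing_pct:.1f}")
print(f"Duplicate rows (%): {duplicate_pct:.1f}")
print(f"Average record size (B): {avg_record_bytes:.1f}")
```

Rounded to one decimal place, these agree with the 0.2%, 0.1% and 120.0 B reported here.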

Variable types

NUM: 10
BOOL: 2
CAT: 2

Warnings

Dataset has 546 (0.1%) duplicate rows [Duplicates]
customer_id has a high cardinality: 245455 distinct values [High cardinality]
order_date has a high cardinality: 776 distinct values [High cardinality]
customer_order_rank has 24767 (3.1%) missing values [Missing]
voucher_amount is highly skewed (γ1 = 30.39394065) [Skewed]
platform_id is highly skewed (γ1 = -22.53663783) [Skewed]
voucher_amount has 743462 (94.5%) zeros [Zeros]
delivery_fee has 597536 (76.0%) zeros [Zeros]
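
The skewness γ1 in the warnings is the third standardized moment. A minimal sketch of the plain moment-based estimator follows; it is an assumption that this matches the report closely, since the profiling software may apply a bias correction, and the function name and toy data are illustrative only:

```python
def skewness(values):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Toy data, not taken from the report: mostly zeros with a long right tail,
# the same shape that triggers the voucher_amount warning.
zero_heavy = [0] * 95 + [1, 2, 3, 5, 50]
symmetric = [1, 2, 3, 4, 5]

print(skewness(zero_heavy) > 0)  # long right tail gives positive skewness
print(skewness(symmetric))       # symmetric data gives zero skewness
```

A long left tail, as in platform_id with its few very small identifiers, gives a negative value by the same formula.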

Reproduction

Analysis started: 2020-10-11 18:10:25.002419
Analysis finished: 2020-10-11 18:13:23.872274
Duration: 2 minutes and 58.87 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

customer_id
Categorical

HIGH CARDINALITY

Distinct: 245455
Distinct (%): 31.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.0 MiB

Value                  Count
15edce943edd           386
8745a335e9cf           288
d956116d863d           286
0063666607bb           273
ae60dce05485           270
Other values (245450)  785097

Value                  Count   Frequency (%)
15edce943edd           386     < 0.1%
8745a335e9cf           288     < 0.1%
d956116d863d           286     < 0.1%
0063666607bb           273     < 0.1%
ae60dce05485           270     < 0.1%
a54a8e1579d4           254     < 0.1%
bebb751d49b8           253     < 0.1%
26ed6389a3aa           245     < 0.1%
ef6265f74aca           229     < 0.1%
a333fb175a0c           221     < 0.1%
Other values (245445)  783895  99.7%

[Figure: Frequencies of value counts]

Unique

Unique: 145498
Unique (%): 18.5%

[Figure: Histogram of lengths of the category]

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

order_date
Categorical

HIGH CARDINALITY

Distinct: 776
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.0 MiB

Value               Count
2017-01-01          4230
2016-12-18          3395
2017-02-26          3234
2017-02-05          3218
2017-02-12          3125
Other values (771)  769398

Value               Count   Frequency (%)
2017-01-01          4230    0.5%
2016-12-18          3395    0.4%
2017-02-26          3234    0.4%
2017-02-05          3218    0.4%
2017-02-12          3125    0.4%
2016-12-11          3100    0.4%
2016-12-04          3075    0.4%
2017-01-22          3005    0.4%
2017-01-29          3003    0.4%
2016-10-03          2999    0.4%
Other values (766)  754216  95.9%

[Figure: Frequencies of value counts]

Unique

Unique: 41
Unique (%): < 0.1%

[Figure: Histogram of lengths of the category]

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

order_hour
Real number (ℝ≥0)

Distinct: 24
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.58879608
Minimum: 0
Maximum: 23
Zeros: 4627
Zeros (%): 0.6%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 12
Q1: 16
Median: 18
Q3: 20
95-th percentile: 22
Maximum: 23
Range: 23
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 3.357192477
Coefficient of variation (CV): 0.1908710785
Kurtosis: 5.749711941
Mean: 17.58879608
Median Absolute Deviation (MAD): 2
Skewness: -1.749088644
Sum: 13835347
Variance: 11.27074133
Monotonicity: Not monotonic
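
Several of these statistics are derived from the others. A minimal sketch that recomputes the range, IQR, coefficient of variation, variance and sum from the order_hour values above; the variable names are illustrative, and the relationships used are the standard definitions rather than anything stated in the report:

```python
# Values copied from the order_hour summary in this report.
n_observations = 786600
mean = 17.58879608
std = 3.357192477
minimum, maximum = 0, 23
q1, q3 = 16, 20

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance
total = mean * n_observations    # Sum (order_hour has no missing values)

print(value_range, iqr)
print(f"CV: {cv:.6f}  Variance: {variance:.5f}  Sum: {total:.0f}")
```

The same relationships can be used as a quick consistency check on the other numeric variables in this report.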
[Figure: Histogram with fixed size bins (bins=24)]

Common values

Value              Count   Frequency (%)
19                 134030  17.0%
18                 129654  16.5%
20                 108739  13.8%
17                 90782   11.5%
21                 68223   8.7%
16                 48877   6.2%
15                 34286   4.4%
22                 33403   4.2%
13                 31105   4.0%
14                 30323   3.9%
Other values (14)  77178   9.8%

Minimum 5 values

Value  Count  Frequency (%)
0      4627   0.6%
1      2425   0.3%
2      1187   0.2%
3      443    0.1%
4      137    < 0.1%

Maximum 5 values

Value  Count   Frequency (%)
23     13832   1.8%
22     33403   4.2%
21     68223   8.7%
20     108739  13.8%
19     134030  17.0%

customer_order_rank
Real number (ℝ≥0)

MISSING

Distinct: 369
Distinct (%): < 0.1%
Missing: 24767
Missing (%): 3.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.436809642
Minimum: 1
Maximum: 369
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 3
Q3: 10
95-th percentile: 39
Maximum: 369
Range: 368
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 17.77232218
Coefficient of variation (CV): 1.88329773
Kurtosis: 49.04720204
Mean: 9.436809642
Median Absolute Deviation (MAD): 2
Skewness: 5.494014541
Sum: 7189273
Variance: 315.8554356
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count   Frequency (%)
1                   244937  31.1%
2                   96641   12.3%
3                   60532   7.7%
4                   43681   5.6%
5                   34036   4.3%
6                   27603   3.5%
7                   23049   2.9%
8                   19696   2.5%
9                   17013   2.2%
10                  14889   1.9%
Other values (359)  179756  22.9%
(Missing)           24767   3.1%

Minimum 5 values

Value  Count   Frequency (%)
1      244937  31.1%
2      96641   12.3%
3      60532   7.7%
4      43681   5.6%
5      34036   4.3%

Maximum 5 values

Value  Count  Frequency (%)
369    1      < 0.1%
368    1      < 0.1%
367    1      < 0.1%
366    1      < 0.1%
365    1      < 0.1%

is_failed
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.0 MiB

Value  Count   Frequency (%)
0      761833  96.9%
1      24767   3.1%

voucher_amount
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 911
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.09148909292
Minimum: 0
Maximum: 93.3989
Zeros: 743462
Zeros (%): 94.5%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0.686
Maximum: 93.3989
Range: 93.3989
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.4795579176
Coefficient of variation (CV): 5.241694963
Kurtosis: 3886.352852
Mean: 0.09148909292
Median Absolute Deviation (MAD): 0
Skewness: 30.39394065
Sum: 71965.32049
Variance: 0.2299757963
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count   Frequency (%)
0                   743462  94.5%
1.029               11647   1.5%
1.715               11134   1.4%
2.058               9122    1.2%
0.686               3648    0.5%
1.372               1770    0.2%
2.744               1192    0.2%
2.5725              897     0.1%
3.43                543     0.1%
0.5145              373     < 0.1%
Other values (901)  2812    0.4%

Minimum 5 values

Value    Count   Frequency (%)
0        743462  94.5%
0.00343  35      < 0.1%
0.28469  1       < 0.1%
0.32242  1       < 0.1%
0.343    19      < 0.1%

Maximum 5 values

Value     Count  Frequency (%)
93.3989   1      < 0.1%
78.02907  1      < 0.1%
68.3942   1      < 0.1%
61.82575  1      < 0.1%
37.57565  1      < 0.1%

delivery_fee
Real number (ℝ≥0)

ZEROS

Distinct: 98
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1811799318
Minimum: 0
Maximum: 9.86
Zeros: 597536
Zeros (%): 76.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0.986
Maximum: 9.86
Range: 9.86
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.3697095668
Coefficient of variation (CV): 2.040565769
Kurtosis: 8.481347092
Mean: 0.1811799318
Median Absolute Deviation (MAD): 0
Skewness: 2.417459196
Sum: 142516.1343
Variance: 0.1366851638
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value              Count   Frequency (%)
0                  597536  76.0%
0.493              70617   9.0%
0.986              35735   4.5%
0.7395             34790   4.4%
0.2465             7664    1.0%
1.2325             7164    0.9%
1.479              6768    0.9%
1.4297             5078    0.6%
0.46835            3097    0.4%
0.4437             2657    0.3%
Other values (88)  15494   2.0%

Minimum 5 values

Value    Count   Frequency (%)
0        597536  76.0%
0.02465  10      < 0.1%
0.0493   3       < 0.1%
0.0986   4       < 0.1%
0.1479   303     < 0.1%

Maximum 5 values

Value   Count  Frequency (%)
9.86    1      < 0.1%
7.395   1      < 0.1%
6.6555  1      < 0.1%
6.409   1      < 0.1%
5.916   1      < 0.1%

amount_paid
Real number (ℝ≥0)

Distinct: 6471
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.18327131
Minimum: 0
Maximum: 1131.03
Zeros: 872
Zeros (%): 0.1%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 4.5135
Q1: 6.64812
Median: 9.027
Q3: 12.213
95-th percentile: 19.5408
Maximum: 1131.03
Range: 1131.03
Interquartile range (IQR): 5.56488

Descriptive statistics

Standard deviation: 5.6181212
Coefficient of variation (CV): 0.5517010233
Kurtosis: 2243.912588
Mean: 10.18327131
Median Absolute Deviation (MAD): 2.655
Skewness: 15.5881411
Sum: 8010161.21
Variance: 31.56328582
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                Count   Frequency (%)
5.31                 14667   1.9%
7.965                14410   1.8%
6.372                11878   1.5%
8.496                10350   1.3%
6.903                9988    1.3%
5.841                9734    1.2%
9.027                9213    1.2%
7.434                9156    1.2%
10.62                8982    1.1%
9.558                8377    1.1%
Other values (6461)  679845  86.4%

Minimum 5 values

Value    Count  Frequency (%)
0        872    0.1%
0.00531  1      < 0.1%
0.01593  1      < 0.1%
0.02655  1      < 0.1%
0.03717  1      < 0.1%

Maximum 5 values

Value      Count  Frequency (%)
1131.03    1      < 0.1%
581.7105   1      < 0.1%
363.01815  1      < 0.1%
353.3805   1      < 0.1%
246.88845  1      < 0.1%

restaurant_id
Real number (ℝ≥0)

Distinct: 13569
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 162864079.3
Minimum: 73498
Maximum: 340453498
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 73498
5-th percentile: 29803498
Q1: 86023498
Median: 169613498
Q3: 228433498
95-th percentile: 302393498
Maximum: 340453498
Range: 340380000
Interquartile range (IQR): 142410000

Descriptive statistics

Standard deviation: 87830821.23
Coefficient of variation (CV): 0.5392890906
Kurtosis: -1.08595334
Mean: 162864079.3
Median Absolute Deviation (MAD): 71240000
Skewness: -0.02254910338
Sum: 1.281088848e+14
Variance: 7.714253157e+15
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                 Count   Frequency (%)
37623498              1317    0.2%
983498                1071    0.1%
192673498             1031    0.1%
154543498             999     0.1%
88773498              967     0.1%
146723498             942     0.1%
105253498             935     0.1%
18603498              922     0.1%
30633498              918     0.1%
29593498              882     0.1%
Other values (13559)  776616  98.7%

Minimum 5 values

Value   Count  Frequency (%)
73498   120    < 0.1%
123498  37     < 0.1%
153498  193    < 0.1%
173498  181    < 0.1%
193498  84     < 0.1%

Maximum 5 values

Value      Count  Frequency (%)
340453498  1      < 0.1%
340093498  2      < 0.1%
340033498  1      < 0.1%
339983498  2      < 0.1%
339913498  1      < 0.1%

city_id
Real number (ℝ≥0)

Distinct: 3749
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 47179.7505
Minimum: 230
Maximum: 100205
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 230
5-th percentile: 10346
Q1: 24799
Median: 46467
Q3: 67886
95-th percentile: 89749
Maximum: 100205
Range: 99975
Interquartile range (IQR): 43087

Descriptive statistics

Standard deviation: 25904.63056
Coefficient of variation (CV): 0.5490624747
Kurtosis: -1.018564164
Mean: 47179.7505
Median Absolute Deviation (MAD): 21419
Skewness: 0.05185593619
Sum: 3.711159174e+10
Variance: 671049884.7
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                Count   Frequency (%)
10346                86654   11.0%
20326                36210   4.6%
80562                34100   4.3%
50898                21627   2.7%
40441                16732   2.1%
60537                14760   1.9%
44366                14119   1.8%
45358                11246   1.4%
4334                 11106   1.4%
90633                10449   1.3%
Other values (3739)  529597  67.3%

Minimum 5 values

Value  Count  Frequency (%)
230    993    0.1%
1298   6519   0.8%
1676   77     < 0.1%
1685   33     < 0.1%
1689   18     < 0.1%

Maximum 5 values

Value   Count  Frequency (%)
100205  1      < 0.1%
100079  1      < 0.1%
100061  3      < 0.1%
100048  56     < 0.1%
99999   5      < 0.1%

payment_id
Real number (ℝ≥0)

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1668.509077
Minimum: 1491
Maximum: 1811
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 1491
5-th percentile: 1523
Q1: 1619
Median: 1619
Q3: 1779
95-th percentile: 1779
Maximum: 1811
Range: 320
Interquartile range (IQR): 160

Descriptive statistics

Standard deviation: 87.19266546
Coefficient of variation (CV): 0.05225783105
Kurtosis: -1.011622604
Mean: 1668.509077
Median Absolute Deviation (MAD): 0
Skewness: 0.2658271582
Sum: 1312449240
Variance: 7602.56091
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=5)]

Common values

Value  Count   Frequency (%)
1619   476600  60.6%
1779   234133  29.8%
1491   36497   4.6%
1811   34492   4.4%
1523   4878    0.6%

Minimum 5 values

Value  Count   Frequency (%)
1491   36497   4.6%
1523   4878    0.6%
1619   476600  60.6%
1779   234133  29.8%
1811   34492   4.4%

Maximum 5 values

Value  Count   Frequency (%)
1811   34492   4.4%
1779   234133  29.8%
1619   476600  60.6%
1523   4878    0.6%
1491   36497   4.6%

platform_id
Real number (ℝ≥0)

SKEWED

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29868.52938
Minimum: 525
Maximum: 30423
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 525
5-th percentile: 29463
Q1: 29463
Median: 29815
Q3: 30231
95-th percentile: 30359
Maximum: 30423
Range: 29898
Interquartile range (IQR): 768

Descriptive statistics

Standard deviation: 1160.893265
Coefficient of variation (CV): 0.03886677012
Kurtosis: 565.3036862
Mean: 29868.52938
Median Absolute Deviation (MAD): 352
Skewness: -22.53663783
Sum: 2.349458521e+10
Variance: 1347673.174
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=14)]

Common values

Value             Count   Frequency (%)
29463             241523  30.7%
30231             216726  27.6%
29815             158972  20.2%
30359             103653  13.2%
30391             24434   3.1%
29751             19321   2.5%
29495             11151   1.4%
30423             6819    0.9%
30199             2079    0.3%
525               1094    0.1%
Other values (4)  828     0.1%

Minimum 5 values

Value  Count   Frequency (%)
525    1094    0.1%
22167  3       < 0.1%
22263  232     < 0.1%
22295  1       < 0.1%
29463  241523  30.7%

Maximum 5 values

Value  Count   Frequency (%)
30423  6819    0.9%
30391  24434   3.1%
30359  103653  13.2%
30231  216726  27.6%
30199  2079    0.3%

transmission_id
Real number (ℝ≥0)

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4253.246112
Minimum: 212
Maximum: 21124
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 212
5-th percentile: 4228
Q1: 4228
Median: 4324
Q3: 4356
95-th percentile: 4356
Maximum: 21124
Range: 20912
Interquartile range (IQR): 128

Descriptive statistics

Standard deviation: 572.8556657
Coefficient of variation (CV): 0.1346866959
Kurtosis: 176.6261099
Mean: 4253.246112
Median Absolute Deviation (MAD): 32
Skewness: -0.9114324558
Sum: 3345603392
Variance: 328163.6137
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=10)]

Common values

Value  Count   Frequency (%)
4356   341734  43.4%
4324   203668  25.9%
4228   201617  25.6%
4260   14538   1.8%
212    12676   1.6%
4996   6737    0.9%
4196   5276    0.7%
1988   207     < 0.1%
21124  146     < 0.1%
2020   1       < 0.1%

Minimum 5 values

Value  Count   Frequency (%)
212    12676   1.6%
1988   207     < 0.1%
2020   1       < 0.1%
4196   5276    0.7%
4228   201617  25.6%

Maximum 5 values

Value  Count   Frequency (%)
21124  146     < 0.1%
4996   6737    0.9%
4356   341734  43.4%
4324   203668  25.9%
4260   14538   1.8%
 
is_returning_customer
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.0 MiB

Value  Count   Frequency (%)
1      408889  52.0%
0      377711  48.0%

Interactions

[100 interaction plots (Matplotlib SVG images) omitted.]

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
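
The calculation can be sketched in a few lines. This is a minimal illustration using only the standard library, with population (divide by n) moments; the function name and toy data are assumptions for the example and are not part of the report:

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)

# Toy data: a perfect linear relationship, its mirror image, and a noisy pair.
xs = [1.0, 2.0, 3.0, 4.0]
r_positive = pearson_r(xs, [2.0, 4.0, 6.0, 8.0])
r_negative = pearson_r(xs, [8.0, 6.0, 4.0, 2.0])

# Shifting and rescaling one variable leaves r unchanged.
ys = [1.0, 3.0, 2.0, 5.0]
r_original = pearson_r(xs, ys)
r_rescaled = pearson_r([10 * x + 3 for x in xs], ys)

print(r_positive, r_negative, r_original, r_rescaled)
```

The last two values agree up to floating-point error, which illustrates the invariance under changes in location and scale.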

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
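
A minimal sketch of this rank-then-correlate procedure, assuming ties receive the average of the ranks they span (a common convention, though not one stated in the report); the helper names and toy data are illustrative:

```python
import math

def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Toy data: y = x ** 3 is monotonic but not linear.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

On this toy data ρ reaches 1 while r stays below 1, which is the sense in which ρ catches nonlinear monotonic relationships.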

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
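
The pair counting can be sketched directly. This minimal version implements the formula exactly as worded above (often called tau-a), where tied pairs count as neither concordant nor discordant; it is an assumption that the report's software uses this variant, since library implementations commonly apply a tie correction. The function name and toy data are illustrative:

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move the same way
        elif direction < 0:
            discordant += 1   # the variables move in opposite ways
    n = len(xs)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs

# Toy data: one swapped pair out of six gives 5 concordant and 1 discordant.
xs = [1, 2, 3, 4]
tau_swapped = kendall_tau_a(xs, [1, 3, 2, 4])
print(kendall_tau_a(xs, [1, 2, 3, 4]), tau_swapped, kendall_tau_a(xs, [4, 3, 2, 1]))
```

The loop visits every pair of observations, so this sketch is quadratic in the number of rows and is meant for illustration rather than for a dataset of this size.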

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

[3 missing-value plots (Matplotlib SVG images) omitted.]

Sample

First rows

index   customer_id   order_date  order_hour  customer_order_rank  is_failed  voucher_amount  delivery_fee  amount_paid  restaurant_id  city_id  payment_id  platform_id  transmission_id  is_returning_customer
0       000097eabfd9  2015-06-20  19          1.0                  0          0.0             0.000         11.46960     5803498        20326    1779        30231        4356             0
1       0000e2c6d9be  2016-01-29  20          1.0                  0          0.0             0.000         9.55800      239303498      76547    1619        30359        4356             0
2       000133bb597f  2017-02-26  19          1.0                  0          0.0             0.493         5.93658      206463498      33833    1619        30359        4324             1
3       00018269939b  2017-02-05  17          1.0                  0          0.0             0.493         9.82350      36613498       99315    1619        30359        4356             0
4       0001a00468a6  2015-08-04  19          1.0                  0          0.0             0.493         5.15070      225853498      16456    1619        29463        4356             0
5       0001d9036b5e  2015-08-29  19          1.0                  0          0.0             0.000         11.94750     193643498      88276    1619        29463        4356             0
6       0001d9036b5e  2017-01-04  17          2.0                  0          0.0             0.000         11.15100     193643498      88276    1619        29463        4356             0
7       0001d9036b5e  2017-01-28  16          3.0                  0          0.0             0.000         9.71730      193643498      88276    1619        30359        4356             0
8       0001e1e04d7d  2015-10-24  19          1.0                  0          0.0             0.000         25.22250     144833498      45358    1619        29463        4356             1
9       0001e1e04d7d  2016-03-24  19          2.0                  0          0.0             0.000         9.29250      9593498        45358    1619        29463        4324             1

Last rows

index   customer_id   order_date  order_hour  customer_order_rank  is_failed  voucher_amount  delivery_fee  amount_paid  restaurant_id  city_id  payment_id  platform_id  transmission_id  is_returning_customer
786590  fffcf45e5c69  2016-11-19  12          1.0                  0          0.0             0.0000        12.53160     107463498      39335    1619        29463        4356             0
786591  fffcf45e5c69  2017-02-04  12          2.0                  0          0.0             0.0000        11.57580     107463498      39335    1619        30359        4356             0
786592  fffd696eaedd  2015-09-14  12          1.0                  0          0.0             1.4297        24.13395     95323498       80562    1779        29463        4356             0
786593  fffe9d5a8d41  2016-07-31  21          NaN                  1          0.0             0.0000        8.44290      156133498      10346    1811        29463        212              1
786594  fffe9d5a8d41  2016-09-30  20          1.0                  0          0.0             0.0000        10.72620     983498         10346    1779        29463        4228             1
786595  fffe9d5a8d41  2016-09-30  20          NaN                  1          0.0             0.0000        10.72620     983498         10346    1779        29463        212              1
786596  ffff347c3cfa  2016-08-17  21          1.0                  0          0.0             0.0000        7.59330      52893498       41978    1619        30359        4356             1
786597  ffff347c3cfa  2016-09-15  21          2.0                  0          0.0             0.0000        5.94720      164653498      41978    1619        30359        4356             1
786598  ffff4519b52d  2016-04-02  19          1.0                  0          0.0             0.0000        21.77100     16363498       80562    1491        29751        4228             0
786599  ffffccbfc8a4  2015-05-30  20          1.0                  0          0.0             0.0000        16.46100     150293498      45952    1619        29463        4324             0